Feature-preserving simplification framework for 3D point cloud

To obtain a higher simplification rate while retaining geometric features, a simplification framework for point clouds is proposed. First, multi-angle images of the original point cloud are obtained with a virtual camera. Then, feature lines of each image are extracted by a deep neural network. Next, according to the proposed mapping relationship between the acquired 2D feature lines and the original point cloud, feature points of the point cloud are extracted automatically. Finally, the simplified point cloud is obtained by fusing the feature points and the simplified non-feature points. The proposed simplification method is applied to four data sets and compared with six other algorithms. The experimental results demonstrate that the proposed method is superior in both retaining geometric features and achieving a high simplification rate.

Previous research on point cloud simplification can generally be classified into two kinds: mesh-based and scattered-point-based. The mesh-based methods convert the point cloud into a polygonal mesh model and then reduce the points according to specific rules. Hamann 19 developed an algorithm that iteratively deletes triangles from the triangulation, which works well on the model surface. Lounsbery et al. 20 simplified the connected triangular mesh through a wavelet representation. Weir et al. 21 proposed a simplification algorithm based on the bounding box. The algorithm constructs a bounding box that surrounds all the points of the 3D model and divides it evenly into small cubes; in each cube, the point closest to the center replaces all other points. The implementation is simple; however, different point clouds need different division scales, and the simplification accuracy cannot be guaranteed. Kalvin and Taylor 22 proposed a bounded approximation method, which places the vertices of the polyhedral mesh within an error region and improves practical feasibility. Gong et al. 23 combined the voxel grid with the bounding box and determined the center of each small grid cell by calculating the k-neighborhood distance and the normal. This method is apt to discard feature points in areas where the curvature changes. The main disadvantages of these mesh-based methods are that they are highly complex and that building polygonal meshes requires a great deal of extra information and memory. In contrast, the scattered-point-based methods operate on the point cloud directly. These methods fall mainly into two categories: one simplifies globally, and the other follows a partition strategy, extracting the feature points before simplification.
Song and Feng 24 reduced the points globally according to a specified simplification ratio. Shi et al. 25 proposed a simplification method based on k-means clustering. Xiao and Huang 26 proposed a kd-tree-based method that uniformly simplifies the point cloud. Although the simplification efficiency is high, the threshold needs to be adjusted for each specific model. Zin et al. 27 presented a simplification method based on the unit normal vector. The feature points are extracted by constructing boundary spheres to search for the k nearest neighbors and measuring the curvature of each point. Wei 28 established the tangent plane with the least squares fitting method 29, analyzed the geometric distribution of the points on the projection surface according to the relationship between the points, and detected the edge feature points. Zanger et al. 30 presented a multi-level method for preserving geometric features of different scales. Han et al. 31 proposed an edge-preserving algorithm. Elkhrachy 32 [...] threshold to determine the edge points. Chen and Sun 33 proposed a method that divides the original point cloud into spaces, constructs the k-neighborhood of each point, sets feature parameters for analysis, and combines the local average distance with the contours of the edge points for classification. The main shortcoming of the scattered-point-based methods is that, lacking topological structure, they neglect the intrinsic correlation between points of the point cloud, which results in the loss of some significant geometric characteristics.
Here, a novel simplification framework for the point cloud is presented. It consists of two parts, the feature points and the simplified non-feature points. Feature points, which are vital for representing the geometric features of the point cloud, are extracted through three steps. The first is obtaining multi-angle images of the point cloud; the second is extracting feature lines of each image; the third is obtaining the feature points from the original point cloud based on the extracted feature lines. The simplified non-feature points, which are utilized for filling the flat areas of the 3D model to maintain its integrity, are extracted from the subset of the point cloud except for feature points. The flowchart of the proposed framework is shown in Fig. 1.
Experimental results demonstrate that the proposed framework produces a high-quality simplified point cloud. The main contributions of this work can be summarized as follows: (1) A feature-line-based framework is proposed for point cloud simplification. Inspired by the success of deep learning in extracting critical image features, a method that transforms the 3D point cloud into 2D images is proposed to better learn the characteristics of the point cloud. (2) The mapping relationship between the images and the point cloud is presented. According to the correspondence between the point cloud and the images, one pixel in the image is related to a group of 3D points (one-to-k), and a novel method is proposed to obtain the final correspondence (one-to-one).
The choice of the axis and angle for capturing 2D images. To analyze the influence of the coordinate axis, the experiment fixes the rotation angle at 60° and extracts the feature points using the X-axis, X/Y-axis, and X/Y/Z-axis respectively. The numbers of feature points are shown in Table 1. Figures 2, 3, 4 and 5 illustrate the point clouds with the different numbers of feature points extracted with different axes for Bunny, Elephant, Gargo50k and Horse. For example, in Fig. 2, X_1657 indicates that the original point cloud model is rotated around the X-axis to capture the corresponding 2D images, and 1657 is the number of feature points extracted from the original point cloud based on the mapping relationship between the 2D images and the 3D model. X/Y_3126 indicates that the 3D model is rotated around the X-axis and Y-axis to capture the 2D images.
Table 1. Results of different feature points under X-axis, X/Y-axis, and X/Y/Z-axis with 60°.
| Model | X-axis | X/Y-axis | X/Y/Z-axis | Original model |
|---|---|---|---|---|
| Bunny | 1657 | 3126 | 3982 | 35,944 |
| Elephant | 1162 | 2423 | 3347 | 24,950 |
| Gargo50k | 958 | 1580 | 2166 | 25,036 |
| Horse | 1883 | 3401 | 4324 | |
From Table 1 and Figs. 2, 3, 4 and 5, it can be seen that as more axes are used (X-axis, X/Y-axis, and X/Y/Z-axis), the number of feature points increases significantly, and the increments contributed by the X-axis and the X/Y-axis are clearly larger than the increment contributed by adding the Z-axis. The Stanford models are evidently aligned with the X/Y/Z axes: the information of most points is concentrated on the X-axis and Y-axis, and the information on the Z-axis is naturally relatively small. However, as most models are not aligned with the standard X/Y axes, the information on the Z-axis is also very important and should be retained. Moreover, under the unified X/Y/Z-axis mode, the integrity of the 3D point cloud feature information is guaranteed, and irregular models do not need to be initialized, which eliminates some troublesome preprocessing.
To analyze the impact of different rotation angles, the experiment fixes the rotation axes as the X/Y/Z-axis and extracts the feature points at angles of 90°, 60°, 45° and 30° respectively. The numbers of feature points are shown in Table 2. Table 2 shows that as the angle increases, the captured images carry less information, and fewer feature points are extracted from the original point cloud. Therefore, the smaller the angle, the more feature points are obtained and the finer the captured detail. However, as the number of images increases, the information redundancy gets worse, which makes the calculation time-consuming and complex. It is therefore essential to balance the rotation angle against the number of images. From Figs. 6, 7, 8 and 9, it can be found that the result with a rotation angle of 60° retains the geometric characteristics of the point cloud better than the result with 90°, while taking less time than the results with 45° and 30°. Therefore, the X/Y/Z-axis and 60° are chosen as the axis and angle for capturing 2D images.
The choice of parameters for extracting feature points of the 3D point cloud. Based on the predicted feature images, three parameters α, β and γ together constrain the feature point extraction. α is the threshold applied to the predicted grayscale feature image to decide whether each pixel belongs to a feature line. β controls the width of the feature line. γ controls the spatial size in the 3D model that corresponds to each feature pixel in the 2D image. The influence of different parameters on extracting feature points of the 3D point cloud (using the bunny as an example) is shown in Table 3. Table 3 shows that different parameters have different effects on extracting the feature points. Changing α has little effect on the number of feature points, while β and γ make a great difference. Figure 10 shows the feature extraction results of the bunny with approximate feature points.
It can be seen that the point clouds in Fig. 10e-g are too sparse, and the contours are not complete enough to cover all feature points. In contrast, (a) has too many points and does not meet the requirements of simplification. The ears in (c) and (d) are incomplete, and there are holes in both. Of all the results, (b) is the best: it retains the contour points and also shows more details such as the ears, neck and bottom. Therefore, we select the parameters that give around 4000 feature points. There are three groups of such parameters in Table 3: α = 0.2, β = 2, γ = 0.33; α = 0.23, β = 2, γ = 0.33; and α = 0.25, β = 2, γ = 0.33. A series of experiments on other models with these parameters indicates that the results are most stable with α = 0.23, β = 2, and γ = 0.33. Therefore, we adopt α = 0.23, β = 2, and γ = 0.33 as the optimal parameters. The feature point extraction results obtained with these parameters are shown in Fig. 11.

Results
To show the simplification results more intuitively, the simplified point cloud is reconstructed into a 3D model. To further verify its superiority, our method is compared with other existing ones. Figure 12 shows the reconstruction results of the simplified bunny: (a) original data, 35,947 points; (b) our method, 7000 points; (c) the simplification method based on DFPSA, 6730 points; (d) the simplification method based on Gaussian spheres, 8491 points; (e) the simplification method based on octree coding, 3005 points; (f) the k-means clustering simplification method, 17,385 points; (g) the uniform simplification method, 4539 points; (h) the geometric algebra method, 5434 points. Figure 12 reveals that the simplification methods based on Gaussian spheres, octree coding, k-means, uniform simplification, and geometric algebra result in serious holes at the tips of the bunny's ears. For DFPSA, the ears look complete and have the same outline as the original model, but there are still some holes at the corners. Our method gives a smoother result with a total number of simplified points similar to that of DFPSA. Figure 13 shows [...] (f) the k-means clustering simplification method, 15,833 points; (g) the uniform simplification method, 2851 points; (h) the geometric algebra method, 3887 points. Figure 13 illustrates that the simplification method based on octree coding and the uniform simplification method give very poor reconstruction results. There are large areas of holes, and many other details are lost. Except for the result of our method in (b), the other four methods have some small holes in the nose, body or other parts.
The reconstruction result generated by our method is nearly consistent with the original point cloud model and uses fewer simplified points.
Due to the obvious asymmetry of gargo50k, both the front and the back sides of the reconstructed 3D models are used for comparison. In Figs. 14 and 15, the first row shows the front reconstruction results and the second row shows the back reconstruction results.
The reconstruction results of gargo50k with different simplification methods are shown in Figs. 14 and 15. The results of the octree coding and uniform simplification methods are not promising: there are large holes at the wing of the gargo50k model. The simplification method based on k-means clustering has a poor simplification rate, and the base at the back side still has holes. The methods based on Gaussian spheres and geometric algebra also lead to many holes, with vacancies on both the front and the back. Except for some small blanks, the DFPSA-based method has almost the same effect as the method proposed in this work. The overall analysis shows that the result generated by our method is superior to the others. Figure 16 shows [...] In Fig. 16, the octree coding method and the uniform simplification method have a high simplification rate, but many details of the legs are lost, with large holes and even breaks. The method based on k-means clustering shows obvious loss of detail on the back of the horse. The simplification results based on Gaussian spheres, DFPSA, and geometric algebra are almost the same as that of our method and are very close to the original data, but our method uses fewer points in the simplified point cloud.
From the results of the different simplification methods, we find that the method based on octree coding and the uniform simplification method retain only limited contours because they keep the fewest points; there are holes in the models, and the simplification results are not good. The method based on geometric algebra yields a smaller number of simplified points, but all models except the horse show a loss of detail features. The k-means clustering method retains the detailed features to a large extent, but its simplification rate is lower, and holes still appear in some non-feature regions, which makes the clustering method less than ideal. The Gaussian sphere method and the DFPSA method maintain the surface feature contours, but compared with them our method has a higher simplification rate and a better fit according to the reconstruction results.
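As a quick arithmetic check on the bunny point counts listed for Fig. 12, the following sketch computes a simplification rate per method. The definition used here (the fraction of original points that is removed) and the function name `simplification_rate` are assumptions for illustration; the text does not state the formula explicitly.

```python
# Point counts for the bunny model as listed for Fig. 12.
ORIGINAL = 35947
SIMPLIFIED = {
    "ours": 7000,
    "DFPSA": 6730,
    "Gaussian spheres": 8491,
    "octree coding": 3005,
    "k-means clustering": 17385,
    "uniform": 4539,
    "geometric algebra": 5434,
}


def simplification_rate(n_original, n_simplified):
    """Assumed definition: fraction of the original points that were removed."""
    return 1.0 - n_simplified / n_original


for name, count in SIMPLIFIED.items():
    rate = simplification_rate(ORIGINAL, count)
    print(f"{name:20s} {count:6d} points  rate = {rate:.1%}")
```

Under this assumed definition a higher rate alone does not indicate a better result; it has to be read together with the reconstruction quality discussed in the text.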

Conclusion
Point cloud simplification plays a very important role in 3D data processing. One of its most important principles is to reduce the number of points as much as possible without noticeably affecting the reconstruction. In this paper, a novel feature-preserving point cloud simplification framework is developed. It takes advantage of deep learning on images and retains the geometric features and the underlying surface of the point cloud, with higher reconstruction quality and fewer points. The experimental results demonstrate that the proposed method generalizes better to different models than the other algorithms and better expresses the geometric appearance and detailed features of the 3D model. Provided that the integrity of the model is preserved, our method reaches the highest simplification rate for the same point cloud among the compared methods. As for future work, self-adaptive parameters should be developed. We hope this work can provide a useful data preprocessing tool for 3D model digitization.

Methods
Acquisition of the multi-angle 2D images. 2D images are projections of the 3D point cloud onto a certain cross-section. To accurately describe the shape of the point cloud model with 2D images, the model needs to be rotated through multiple angles to obtain different images. Here, a script written for the point cloud processing software (Geomagic Wrap) obtains the multi-angle 2D images by rotating the model about a single axis at a time. It should be noted that the model is rotated around the X-axis, Y-axis, and Z-axis respectively.
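The view enumeration described here can be sketched in plain Python. This is not the Geomagic Wrap script used in the paper, which is not reproduced in the text; the function names, the full 360° sweep per axis, and the idea of projecting along Z afterwards are assumptions made for illustration.

```python
import math


def rotation_matrix(axis, degrees):
    """3x3 rotation matrix about the X, Y or Z axis (right-handed)."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    if axis == "x":
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "y":
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if axis == "z":
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError(f"unknown axis: {axis!r}")


def rotate(points, matrix):
    """Apply a rotation matrix to a list of (x, y, z) points."""
    return [tuple(sum(row[k] * p[k] for k in range(3)) for row in matrix)
            for p in points]


def multi_angle_views(points, axes=("x", "y", "z"), step_degrees=60):
    """Yield (axis, angle, rotated_points) for every view.

    Assumption: each axis is swept over a full 360 degrees. Each rotated
    cloud would then be projected along Z (dropping z) to form one 2D image.
    """
    for axis in axes:
        for angle in range(0, 360, step_degrees):
            yield axis, angle, rotate(points, rotation_matrix(axis, angle))


views = list(multi_angle_views([(1.0, 0.0, 0.0)], step_degrees=60))
print(len(views), "views")
```

Under these assumptions the 0° view is identical for all three axes, so a practical script would probably capture it only once.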
As shown in Fig. 17, different axes provide different positions, and different rotation angles about the same axis also lead to different feature points. The selection of the rotation axis and the angle is discussed under "The choice of the axis and angle for capturing 2D images". [...] Fig. 18. The network is mainly divided into two parts: one for feature extraction and the other for feature synthesis. The feature extraction module is modified from VGG-16 40. The training dataset is $S = \{(X_n, Y_n),\ n = 1, \ldots, N\}$, where $X_n = \{X_i,\ i = 1, \ldots, N\}$ represents the input images of the network, $Y_n = \{Y_i,\ i = 1, \ldots, N\}$ represents the binary labels of $X_n$ with $Y_n \in \{0, 1\}$, and $N$ refers to the number of input images. The dataset is fed into the network with 13 convolutional layers, 3 fully connected layers, and 5 pooling layers. The network has five stages of side output for feature extraction and is deeply supervised at each stage and at the final fusion.
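The deep supervision over the side outputs and the final fusion, written in Eq. (1) as L_total = L_stage(W, w) + L_fuse(W, h), can be illustrated with a small sketch. The per-output loss is not specified in the surviving text, so plain binary cross-entropy is assumed here; the function names and the flattened-list image representation are likewise illustrative only.

```python
import math


def bce(pred, label, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels.

    Assumption: the paper's per-stage loss may differ (for example by
    class balancing); plain cross-entropy is used here for illustration.
    """
    total = 0.0
    for p, y in zip(pred, label):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(pred)


def total_loss(side_outputs, fused_output, label, weights=None):
    """L_total = sum_k alpha_k * l_stage^(k) + L_fuse, on flattened images."""
    if weights is None:
        weights = [1.0] * len(side_outputs)  # equal weighting of all stages
    l_stage = sum(a * bce(out, label) for a, out in zip(weights, side_outputs))
    l_fuse = bce(fused_output, label)
    return l_stage + l_fuse


label = [1, 0]
uninformative = [0.5, 0.5]
print(total_loss([uninformative] * 5, uninformative, label))
```

With five side outputs and one fused output that all predict 0.5, every term equals ln 2, so the total is six times ln 2 under these assumptions.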
The feature synthesis module fuses the feature maps of each stage. As shown in Fig. 18 [...] The total loss is
$$L_{total} = L_{stage}(W, w) + L_{fuse}(W, h). \quad (1)$$
$\alpha_k$ in Eq. (2) is the weight parameter, and $l^{(k)}_{stage}$ denotes the loss function for the side output of stage $k$. The network is deeply supervised and trained image-to-image. All losses are trained equally and simultaneously. $L_{total}$ is minimized via standard (back-propagation) stochastic gradient descent.
Figure 17. Schematic diagram of bunny model rotating around X/Y/Z axis.
Extraction of the feature points. According to the idea of normalization, a mapping relationship is established between the feature lines of the 2D image and the feature points of the 3D point cloud in Eqs. (3) and (4).
Here, $x_3$ and $y_3$ represent the X-axis and Y-axis coordinates of the 3D point cloud. $x'_3$ and $y'_3$ are the X-axis and Y-axis coordinates of the candidate feature point in the 3D point cloud. $x_2$ and $y_2$ represent the X-axis and Y-axis coordinates of the 2D image. $\max(x_3)$ and $\max(y_3)$ refer to the maximum X-axis and Y-axis coordinates of the 3D point cloud, and $\min(x_3)$ and $\min(y_3)$ are the corresponding minimum values. $m$ and $n$ represent the length and width of the image. Based on the mapping relationship in Eqs. (3) and (4), the X-axis and Y-axis coordinates of the candidate feature point can be roughly determined. However, the Z-axis coordinate cannot be determined directly. Here, we use $(x'_3, y'_3)$ as the centroid and expand the filtering range to the surrounding point cloud. Therefore, Eqs. (3) and (4) can be updated to Eqs. (5) and (6). In this way, a point set P containing a series of candidate points is obtained from the point cloud.
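A minimal sketch of the pixel-to-cloud mapping and the candidate set P follows. The Y mapping follows Eq. (4); the X mapping is assumed to be its counterpart with $m$ in place of $n$. The exact form of Eqs. (5) and (6) is not available in the text, so the window used here (half-width equal to γ times the footprint of one pixel) is purely an assumption, as are the function names.

```python
def pixel_to_xy(x2, y2, points, m, n):
    """Map pixel (x2, y2) of an m-by-n image to cloud coordinates (x3', y3').

    Also returns the footprint of one pixel along X and Y in cloud units.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    sx = (max(xs) - min(xs)) / m  # assumed X counterpart of Eq. (4)
    sy = (max(ys) - min(ys)) / n  # scale factor of Eq. (4)
    return sx * x2 + min(xs), sy * y2 + min(ys), sx, sy


def candidate_points(x2, y2, points, m, n, gamma=0.33):
    """Collect the candidate set P around the mapped location.

    Assumption: the filtering window has half-width gamma times the
    pixel footprint along each axis.
    """
    cx, cy, sx, sy = pixel_to_xy(x2, y2, points, m, n)
    return [p for p in points
            if abs(p[0] - cx) <= gamma * sx and abs(p[1] - cy) <= gamma * sy]


cloud = [(0.0, 0.0, 0.0), (10.0, 20.0, 1.0), (5.0, 10.0, 2.0), (5.0, 10.0, -3.0)]
print(candidate_points(5, 5, cloud, m=10, n=10))
```

The sketch ignores image-axis conventions such as a flipped vertical axis, which a real implementation would need to handle. It also shows why the Z coordinate stays ambiguous: both points at (5, 10) land in the same candidate set.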
γ is the expansion coefficient, which controls the number of candidate points. To characterize the point cloud as fully as possible, the feature image obtained from the aforementioned network needs to be processed more finely. Two further parameters, α and β, control the threshold of the grayscale image and the thickness of the feature line respectively. The three parameters α, β and γ together control the quality of feature point acquisition.
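The roles of α and β can be illustrated with a small sketch on a grayscale image stored as nested lists. It assumes that the network output is normalised to [0, 1], that a pixel belongs to a feature line when its value exceeds α, and that β is turned into a square dilation radius of β // 2; none of these details are given in the text, and the function names are hypothetical.

```python
def binarize(gray, alpha=0.23):
    """Mark a pixel as feature line (1) when its response exceeds alpha."""
    return [[1 if value > alpha else 0 for value in row] for row in gray]


def thicken(mask, beta=2):
    """Widen feature lines by square dilation.

    Assumption: the dilation radius is beta // 2, so beta = 1 leaves the
    mask unchanged and beta = 2 grows every line by one pixel per side.
    """
    radius = beta // 2
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            if not mask[i][j]:
                continue
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < height and 0 <= jj < width:
                        out[ii][jj] = 1
    return out


gray = [[0.1, 0.1, 0.1],
        [0.1, 0.9, 0.1],
        [0.1, 0.1, 0.1]]
for row in thicken(binarize(gray), beta=2):
    print(row)
```

In this reading, α changes which pixels are kept at all, while β multiplies the number of feature pixels around every kept pixel, which is consistent with the observation that β affects the number of feature points far more than α.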
In the point cloud, the same X-axis and Y-axis coordinates often correspond to multiple Z-axis values. When selecting key points, the average z of all points in the candidate point set P is calculated. As the 2D images are captured from the front side, that is, from the positive direction of the Z-axis, the Z-axis coordinate of the corresponding point must be greater than the average z. Based on this, the points in P whose Z-axis coordinate is less than the average z are filtered out. In the remaining candidate point set P', the point with the smallest variance is taken as the corresponding feature point. In this way, all feature points of the point cloud can be extracted.
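The selection step can be sketched as follows. The text does not say how the "smallest variance" is measured, so this sketch adopts one possible reading: the squared distance of each remaining point from the mean of P'. That choice and the function name are assumptions.

```python
def select_feature_point(candidates):
    """Pick one feature point from the candidate set P.

    Points with z below the average z are discarded (they face away from
    the camera); among the rest, the point closest to the mean of the
    remaining set is returned. The distance criterion is an assumed
    interpretation of "smallest variance".
    """
    if not candidates:
        return None
    z_mean = sum(p[2] for p in candidates) / len(candidates)
    # Fall back to all candidates if rounding leaves the front set empty.
    front = [p for p in candidates if p[2] >= z_mean] or list(candidates)
    centre = [sum(p[k] for p in front) / len(front) for k in range(3)]
    return min(front,
               key=lambda p: sum((p[k] - centre[k]) ** 2 for k in range(3)))


P = [(0.0, 0.0, -3.0), (0.0, 0.0, 1.0), (0.1, 0.0, 2.0), (0.0, 0.1, 3.0)]
print(select_feature_point(P))
```

Applying this to the candidate set of every feature pixel in every view would reduce the one-to-k correspondence to one-to-one, as described in the contributions.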
$$y'_3 = \frac{\max(y_3) - \min(y_3)}{n} \cdot y_2 + \min(y_3) \quad (4)$$
Non-feature point simplification. As the non-feature points carry only a small amount of the point cloud model information, they can be simplified to a great extent. The simplification method should be convenient and have low time complexity. For this purpose, an octree coding method is used to simplify the non-feature points. The whole flowchart of the non-feature point simplification is shown in Fig. 19. The octree coding construction consists of three steps. (1) Initialization: set the maximum recursion depth, the maximum scale of the non-feature points, and the first cube. (2) The elements (here, points) are put into the cube without child nodes. (3) If the maximum depth is not reached, the octree coding subdivides the cube continuously, until the number of elements allocated to the child cube is not zero and is the same as that of the parent cube. The octree coding method is convenient and error-free when searching for voxels and the corresponding points in voxels, and its ordered, layered structure makes it easy to balance display accuracy and speed. As the non-feature points in the same leaf node are spatially close, we pay more attention to the relationship between each point and the whole leaf node. The average normal vector and the average curvature are used to describe the whole leaf node. In each octree leaf node, the average normal vector $n_{avg}$ and the average curvature $c_{avg}$ of all points located in it are calculated. Then, the difference between the normal vector $n$ of each point and $n_{avg}$, and the difference between the curvature $c$ of each point and $c_{avg}$, are calculated respectively. The normal vector difference and the curvature difference are then added up. Finally, the point with the smallest sum is selected to replace the other points in the leaf node.
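The choice of a representative point inside one octree leaf can be sketched as below. The octree construction and the estimation of normals and curvatures are not shown; they are taken as given inputs. How the two differences are measured is not stated in the text, so the Euclidean norm for normals, the absolute value for curvature, and an unweighted sum are assumptions, as is the function name.

```python
import math


def leaf_representative(points, normals, curvatures):
    """Return the point of a leaf that best matches the leaf averages.

    Score per point (assumed form): ||n - n_avg|| + |c - c_avg|.
    The point with the smallest score replaces all others in the leaf.
    """
    count = len(points)
    n_avg = [sum(n[a] for n in normals) / count for a in range(3)]
    c_avg = sum(curvatures) / count

    def score(i):
        normal_diff = math.sqrt(
            sum((normals[i][a] - n_avg[a]) ** 2 for a in range(3)))
        curvature_diff = abs(curvatures[i] - c_avg)
        return normal_diff + curvature_diff

    return points[min(range(count), key=score)]


leaf_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
leaf_normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
leaf_curvatures = [0.1, 0.2, 0.9]
print(leaf_representative(leaf_points, leaf_normals, leaf_curvatures))
```

Because normals and curvatures live on different scales, a real implementation might normalise or weight the two terms before summing; the text only says that they are added up.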
Finally, the simplified point cloud can be obtained by fusing the feature points and non-feature points.